Maximum Occurrence

Maximum Occurrence (MO) is a method for rigorously estimating the maximum fraction of time that a conformer of a flexible macromolecule can exist while remaining compatible with the experimental data. The Maximum Occurrence of a conformer is defined as the maximum weight that it can have in any ensemble that reproduces the averaged experimental data.

Background
Flexible proteins, even in the simplest case of two rigid domains linked by a flexible region, may sample a wide conformational space. This variety makes them difficult to study by X-ray crystallography, because a single conformation is trapped in the crystal, if crystals are obtained at all. Solution techniques such as Nuclear Magnetic Resonance (NMR) and Small-Angle Scattering of X-rays and neutrons (SAXS and SANS) provide experimental observables averaged over many conformations with different weights. Recovering from the data alone the conformational ensemble that generated them is an ill-posed inverse problem that admits an infinite number of solutions.

Popular methods for the determination of conformational disorder rely on the construction of ensembles that are solutions to this problem. However, there is no proof that one solution is better than another, so quantitative assessments based on these approaches are risky.

The Maximum Occurrence is the maximum percent of time that a conformer of a macromolecule can exist and still be compatible with the experimental data (see Figure 1). In this case, the quantitative assessment is rigorous, since only one conformer of the ensemble is considered, while the conformers that complete the ensemble are not artificially endowed with a physical relevance that they may not have. Recently, an implementation of this method based on distributed computing was presented to evaluate the MO profiles for a large number of conformers: this represents an evolution of a previous approach in which a few conformations with maximum allowed probability (MAP) were sought.
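The definition can be written compactly as a constrained maximisation. The notation below is an assumption introduced here for illustration and is not taken from the original work: $w_i$ is the weight of conformer $i$ in an ensemble, $O_i$ its predicted observables, $O^{\mathrm{exp}}$ the measured averages, $\chi^2$ any measure of misfit, and $\tau$ the largest misfit still regarded as agreement with the data.

```latex
\[
\mathrm{MO}(c) \;=\; \max_{w}\; w_c
\quad \text{subject to} \quad
w_i \ge 0, \qquad \sum_i w_i = 1, \qquad
\chi^2\!\Big(\sum_i w_i\, O_i,\; O^{\mathrm{exp}}\Big) \le \tau .
\]
```

Only the weight of conformer $c$ is interpreted; the other weights serve solely to complete an ensemble that satisfies the constraint.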

Methods for determination of Maximum Occurrence
As already mentioned, the Maximum Occurrence of a conformation is evaluated by seeking ensembles that contain that conformation with a fixed weight and still reproduce the experimental data; the weight is then increased. Above the Maximum Occurrence, no ensemble can be found that both contains the conformation at the required weight and is compatible with the experimental data. The process is exemplified in Figure 2: the four boxes represent four different ensembles containing the desired conformation (a red star) at different weights (indicated by the size of the star); the fourth is no longer compatible with the experimental data. The Maximum Occurrence approach has been developed mainly using paramagnetism-based NMR restraints and SAXS data, but it is compatible with any biophysical technique that provides experimental data averaged over all the states sampled by the system.
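The search described above can be sketched on toy data as follows. This is a minimal illustration under stated assumptions, not the published implementation: the function names, the sum-of-squares misfit, the acceptance threshold, the grid of trial weights and the Frank-Wolfe optimiser used to fit the remaining conformers are all choices made here for the example, and the numbers are invented.

```python
def misfit(pred, data):
    """Sum of squared deviations between predicted and measured averages."""
    return sum((p - d) ** 2 for p, d in zip(pred, data))


def best_misfit_at_weight(obs, data, target, w, n_iter=500):
    """Smallest misfit found when conformer `target` is held at weight `w`.

    obs[i][j] is observable j predicted for conformer i (hypothetical input;
    at least two conformers are assumed). The remaining weight 1 - w is spread
    over the other conformers by a Frank-Wolfe search with exact line search,
    a generic optimiser chosen for this sketch.
    """
    others = [i for i in range(len(obs)) if i != target]
    n_obs = len(data)
    # Relative weights of the other conformers; they always sum to 1.
    x = {i: 1.0 / len(others) for i in others}

    def mixture(x):
        return [sum(x[i] * obs[i][j] for i in others) for j in range(n_obs)]

    def predict(mix):
        return [w * obs[target][j] + (1.0 - w) * mix[j] for j in range(n_obs)]

    for _ in range(n_iter):
        mix = mixture(x)
        resid = [p - d for p, d in zip(predict(mix), data)]
        # Gradient of the misfit w.r.t. each relative weight,
        # up to the common positive factor 2 * (1 - w).
        grad = {i: sum(r * o for r, o in zip(resid, obs[i])) for i in others}
        best = min(others, key=grad.get)
        # Change in the prediction if all remaining weight moved to `best`.
        delta = [(1.0 - w) * (obs[best][j] - mix[j]) for j in range(n_obs)]
        denom = sum(dl * dl for dl in delta)
        if denom < 1e-30:
            break
        gamma = -sum(r * dl for r, dl in zip(resid, delta)) / denom
        gamma = max(0.0, min(1.0, gamma))
        if gamma == 0.0:
            break  # no descent direction left: the fit cannot be improved
        x = {i: (1.0 - gamma) * x[i] + (gamma if i == best else 0.0)
             for i in others}

    return misfit(predict(mixture(x)), data)


def maximum_occurrence(obs, data, target, threshold, steps=100):
    """Largest trial weight of `target` whose best fit stays within threshold."""
    mo = 0.0
    for k in range(steps + 1):
        w = k / steps
        if best_misfit_at_weight(obs, data, target, w) <= threshold:
            mo = w
    return mo


# Invented toy data: three conformers, two averaged observables.
toy_obs = [[10.0, 0.0], [0.0, 0.0], [4.0, 2.0]]
toy_data = [5.0, 0.0]
toy_mo = maximum_occurrence(toy_obs, toy_data, target=0, threshold=0.0225)
```

The scan keeps the largest acceptable weight instead of stopping at the first failure, because small weights can also be incompatible with the data: in this toy set, no ensemble without the first conformer reproduces the averages. The result is resolved only to the grid spacing, and a real application would replace the misfit and threshold with ones appropriate to the experimental errors.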

Case Study: Calmodulin
Calmodulin (in this case N60D calmodulin; PDB entries 1sw8, 2k0j, 2k61) is a two-domain protein with high mobility in the central region. Paramagnetic NMR restraints such as pseudocontact shifts (PCS) and self-orientation residual dipolar couplings (RDC) provided further insight into this conformational heterogeneity. For this study, 400 conformers were chosen randomly, with the sole requirement that they be sterically allowed, and their Maximum Occurrence was evaluated against three sets of PCS (due to Tb3+, Dy3+ and Tm3+), three sets of RDC (from the same metals), both measured at CERM (in Florence), and X-ray scattering data up to 2 nm⁻¹, as recorded on the X33 beamline at EMBL, DESY, Hamburg. The results of the calculations are represented in Figure 3: orientation tensors, centered on the center of mass of the C-terminal domain, are shown color-coded according to their MO value (from less than 5% in blue to more than 30% in red).